Site Health: Cache the full results and a timestamp in the status transient#12181
Site Health: Cache the full results and a timestamp in the status transient#12181gziolo wants to merge 11 commits into
Conversation
…nsient. Previously the `health-check-site-status-result` transient stored only the aggregate `good`, `recommended`, and `critical` counts, so any consumer that needed the underlying results had to re-run the Site Health tests synchronously. The scheduled check now caches the complete results array and the time they were collected alongside the counts. The Site Health screen AJAX handler validates the submitted counts and preserves the cached results and timestamp while refreshing the counts, and `enqueue_scripts()` only localizes the counts to avoid embedding the full results in every Site Health and Dashboard page. This provides a reusable cached source of detailed Site Health data for the dashboard, the admin menu, and future REST/Abilities consumers without triggering synchronous tests. Adds unit tests covering the scheduled-check caching and the AJAX result handler. See #65232. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
The following accounts have interacted with this PR and/or linked issues. I will continue to update these lists as activity occurs. You can also manually ask me to refresh this list by adding the Core Committers: Use this line as a base for the props when committing in SVN: To understand the WordPress project's expectations around crediting contributors, please review the Contributor Attribution page in the Core Handbook. |
|
Hi there! 👋 Thank you for your contribution to WordPress! 💖 It looks like this is your first pull request to No one monitors this repository for new pull requests. Pull requests must be attached to a Trac ticket to be considered for inclusion in WordPress Core. To attach a pull request to a Trac ticket, please include the ticket's full URL in your pull request description. Pull requests are never merged on GitHub. The WordPress codebase continues to be managed through the SVN repository that this GitHub repository mirrors. Please feel free to open pull requests to work on any contribution you are making. More information about how GitHub pull requests can be used to contribute to WordPress can be found in the Core Handbook. Please include automated tests. Including tests in your pull request is one way to help your patch be considered faster. To learn about WordPress' test suites, visit the Automated Testing page in the handbook. If you have not had a chance, please review the Contribute with Code page in the WordPress Core Handbook. The Developer Hub also documents the various coding standards that are followed:
Thank you, |
Test using WordPress PlaygroundThe changes in this pull request can previewed and tested using a WordPress Playground instance. WordPress Playground is an experimental project that creates a full WordPress instance entirely within the browser. Some things to be aware of
For more details about these limitations and more, check out the Limitations page in the WordPress Playground documentation. |
…|string`, but `mixed` given.
| /* | ||
| * Only the aggregate counts are refreshed here. Preserve the full test results | ||
| * and the timestamp recorded by the scheduled check so they are not discarded | ||
| * when the counts are updated from the Site Health screen. | ||
| */ | ||
| $cached = get_transient( 'health-check-site-status-result' ); | ||
| $cached = is_string( $cached ) ? json_decode( $cached, true ) : null; | ||
|
|
||
| if ( is_array( $cached ) ) { | ||
| if ( isset( $cached['results'] ) && is_array( $cached['results'] ) ) { | ||
| $site_status['results'] = $cached['results']; | ||
| } | ||
|
|
||
| if ( isset( $cached['timestamp'] ) ) { | ||
| $site_status['timestamp'] = (int) $cached['timestamp']; | ||
| } | ||
| } | ||
|
|
||
| set_transient( 'health-check-site-status-result', wp_json_encode( $site_status ) ); |
There was a problem hiding this comment.
Aren't the results here superior to the results obtained via wp_cron_scheduled_check()? They include the results for the async tests which require JS to run and thus cannot be computed via WP Cron. Should the logic here be instead moved to wp_cron_scheduled_check() so that async results are preserved after WP Cron runs?
Aside: we may want to add the timestamp as a key inside each result. This would allow us to drop test results which are overly stale (e.g. if a plugin with a test was deactivated).
westonruter
left a comment
There was a problem hiding this comment.
In addition to my comment about letting the ajax-stored result be preferred, here's a critical observation that Claude made: the lack of an $expiration means that the transient will be autoloaded.
Since it now includes all of the results which could potentially be large, this would run the risk of increasing the bucket of autoloaded options beyond the 1MB limit imposed by Memcached, right?
Full review from Claude
Confirmed the critical detail: set_transient() is called with no expiration ($expiration = 0), so without an external object cache the option is stored with $autoload = true (option.php:1547-1552). The full results blob is therefore autoloaded on every page load site-wide, not just on Site Health screens. That makes the size concern much more severe than the finders estimated.
I have enough verified context. Here are the findings, ranked most-severe first.
Code Review: PR #12181 — Cache Site Health full results + timestamp
Findings
1. Full results blob is autoloaded on every page request (perf regression) — class-wp-site-health.php:3492
set_transient( 'health-check-site-status-result', wp_json_encode( $site_status ) ) is called with no expiration. With no external object cache, set_transient() stores the option with autoload = true (option.php:1547-1552). Previously the value was ~60 bytes (three ints); now it's the entire $results array including every test's HTML description, actions, and badge. That payload is now loaded into memory on every front-end and admin request for every visitor, via wp_load_alloptions() — a site-wide cost paid to store data only the Site Health screen reads (and only the 3 counts from it). Passing an expiration, or storing under a non-autoloaded option, would avoid this.
2. Large results payload can exceed object-cache item limits → counter silently reverts to 0 — class-wp-site-health.php:3489
On installs with a persistent object cache (e.g. Memcached, default 1MB item limit), a site with many site_status_tests (plugins add them, with arbitrary HTML) can produce a JSON blob exceeding the backend's per-item limit. wp_cache_set() then silently fails, get_transient() returns false, and the admin-menu/dashboard counters fall back to 0/0/0 even though valid counts existed. No size cap or truncation is applied.
3. Race between scheduled cron and the AJAX counts update can discard freshly collected results — src/wp-admin/includes/ajax-actions.php:5488
The AJAX handler does a non-atomic read-modify-write: it reads the cached transient, splices in the POSTed counts, and re-writes. If wp_cron_scheduled_check() writes fresh results between the handler's get_transient() and set_transient() (admin loads Site Health while cron runs), the handler overwrites with the older (or absent) results it read earlier — silently dropping the detailed results the ticket exists to preserve, until the next weekly cron.
4. New status guard miscounts unknown/empty statuses and diverges from results count — class-wp-site-health.php:3471
The guard if ( ! is_array( $result ) || ! isset( $result['status'] ) ) continue; only checks that status is set. A result with status => '' or a custom value like 'info' still falls through to the final else and increments good, mislabeling it as a passing check. Separately, a result missing status was previously counted as good (old loop had no guard) but is now skipped entirely — so the cached results array length (e.g. 3) can no longer equal good + recommended + critical, which will surprise any consumer that assumes they match.
5. timestamp freshness mechanism is half-built — class-wp-site-health.php:3490
The comment says the timestamp lets freshness "be evaluated," but: the transient is set with no expiry, nothing in core ever compares the timestamp to now, and the AJAX handler preserves the old cron timestamp on every counts update (ajax-actions.php:5495-5497). So the stored timestamp reflects the last full cron run, not the last update — any future freshness check built on it would be subtly wrong, and the promised mechanism doesn't actually exist yet.
6. Payload schema duplicated across writers/readers with no canonical accessor — src/wp-admin/includes/ajax-actions.php:5477
The count-normalization array( 'good' => (int) ( $x['good'] ?? 0 ), 'recommended' => …, 'critical' => … ) is copied verbatim into ajax-actions.php:5477-5481 and class-wp-site-health.php:126-130, and the get_transient() + json_decode( …, true ) idiom is duplicated too. The transient's shape is now defined by convention across five sites (cron writer, ajax merge, enqueue reader, menu.php, dashboard.php). Adding a status bucket or renaming a key requires editing several hand-maintained literals; missing one yields counts that silently disagree between the menu counter, dashboard widget, and Site Health screen. A single read/write/normalize helper would centralize it.
7. Enqueue decodes the entire results blob just to read three integers — class-wp-site-health.php:117
On every Site Health and Dashboard page render, json_decode() parses the full cached payload (all HTML results) only to extract good/recommended/critical — the comment itself notes "only the aggregate counts are needed." The decode cost scales with the size/number of test results, for no benefit. (Mitigated if findings 1 & 2 are addressed by not storing results in the same hot transient.)
8. AJAX handler re-encodes the full results blob on every counts POST — src/wp-admin/includes/ajax-actions.php:5501
Each time an admin loads the Site Health Status tab, the JS posts the counts and this handler reads, decodes, and re-wp_json_encode()s the entire results payload back into the transient — repeated large serialization writes per page view, not just per weekly cron run (write amplification).
Note on a dropped candidate: finders flagged wp_json_encode( $site_status ) returning false (and storing false) on invalid UTF-8 in result HTML. I refuted it — wp_json_encode() runs _wp_json_sanity_check()/_wp_json_convert_string() to repair invalid UTF-8 and re-encode (functions.php:4411-4417), so the common case recovers; only a >512-depth structure would fail, which Site Health results never reach.
Findings 1–4 are the ones I'd block on; finding 1 in particular looks like a meaningful site-wide performance regression worth raising on the PR. Want me to post these as inline PR comments (/code-review --comment) or open a discussion on the specific lines?
|
Thanks for the review! Reworked this substantially in Two design calls I'd value your take on before going further: the merge policy (1-week grace window vs. browser strictly outranking cron) and whether to cap the detail-cache size. Both noted in the description. Disclosure: I used Claude Code and Codex to develop and validate these changes (reviewed and verified by me). 🙏 |
…ansient. Following review feedback, store only the aggregate counts in the autoloaded `health-check-site-status-result` transient and cache the full per-test results separately in `health-check-site-status-detail`, which has an expiration so the larger payload is not autoloaded on every request. The Site Health screen submits the full results — including the asynchronous tests that require JavaScript to run — and they are sanitized and cached as the authoritative detailed results. The scheduled check refreshes only missing or stale entries so it does not discard fresher results collected from the screen. Each result carries its own timestamp, and entries that have not been refreshed within a month are dropped. Adds WP_Site_Health::get_site_status_counts(), get_site_status_detail(), and merge_site_status_detail() as the canonical accessors, and updates the unit and Ajax tests accordingly. See #65232. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
5bb9fee to
70b0566
Compare
Trac ticket: https://core.trac.wordpress.org/ticket/65232
What & why
Trac #65232 wants a fast, cache-only Site Health summary for
core/get-environment-infothat never runs synchronous tests. Today thehealth-check-site-status-resulttransient stores only the aggregategood/recommended/criticalcounts, so any consumer that needs the underlying results has to re-run the tests.As discussed on the ticket, the first step is to cache the full results (so a consumer — and later Trac #64066's page-cache detail — can reuse them) along with a timestamp so staleness can be judged. This PR delivers that caching foundation only. The Abilities
site_healthfield andfieldsinput are a follow-up that reads this cache.Design
Two caches, with clearly separated jobs:
health-check-site-status-result— unchanged shape: only{ good, recommended, critical }. Small, still autoloaded, backward-compatible for the admin-menu counter and Dashboard widget.health-check-site-status-detail(new) — the full per-test results, keyed by test, each with its owntimestamp. Stored with an expiration so it is not autoloaded (the larger payload must not load on every request).get_site_status_detail()also returnscountsderived from those same results, so a consumer reads one internally consistent view.Who writes what
Detailed cache shape
{ "results": { "<test>": { "test": "…", "label": "…", "status": "good|recommended|critical", "badge": { "label": "…", "color": "…" }, "description": "…", "actions": "…", "timestamp": 1715714399 } }, "counts": { "good": 12, "recommended": 3, "critical": 0 }, "timestamp": 1715714399 }Canonical accessors (on
WP_Site_Health)get_site_status_counts()/set_site_status_counts()— the legacy UI counts (the small autoloaded transient).get_site_status_detail()—{ results, counts, timestamp }, internally consistent.update_site_status_detail( $results, $authoritative )— sanitizes (wp_kses_post+ field sanitizers) and merges into the detail cache.Backward compatibility
The counts transient keeps its existing shape, so
menu.php, the Dashboard widget, and the localizedSiteHealthdata are unchanged. NoWP_Site_Healthload is forced on the admin-menu path.Open questions for reviewers
sourcefield)?Testing
tests/phpunit/tests/admin/wpSiteHealth.php— scheduled-check counts vs. detail split, the fresh-preserve / stale-refresh / stale-drop behavior, derived detail counts, and the unavailable-async fallback being cached under its test identifier.tests/phpunit/tests/ajax/wpAjaxHealthCheckSiteStatusResult.php— independent counts-only and results-only requests, HTML sanitization, capability enforcement.phpcs, PHPStan, and the full
site-healthandajaxtest groups pass locally.Use of AI Tools
AI assistance: Yes
Tool(s): Claude Code (Claude Opus 4.8), Codex
Used for: Development and validation; reviewed and verified by me.
🤖 Generated with Claude Code